Installation Guide
Refer to the installation guide below. Some SDK components, including the RBLN Compiler (`rebel-compiler`) and `vllm-rbln`, require an RBLN Portal account for installation. If you need assistance, please contact us.
1. RBLN Driver
Note
The RBLN Driver is primarily intended for on-premise servers.
If RBLN NPU devices are already visible on your server (`ls /dev/rbln*`), you can skip the driver installation.
The RBLN Driver contains the Linux kernel driver and firmware, enabling the OS to recognize RBLN NPU devices. It is pre-installed on most cloud servers.
Key Features
- Kernel Driver & Firmware: Enables the OS to interface with the RBLN NPU.
- Package Formats: Available as Ubuntu (`.deb`) and RedHat (`.rpm`) packages.
Installation
- Ubuntu
- RedHat
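A minimal sketch of the driver installation for each distribution; the package file names below are placeholders for the files you obtained from Rebellions:

```bash
# Ubuntu: install the RBLN Driver from a local .deb package
# (placeholder file name; root privileges required).
sudo apt-get install ./rbln_driver_<version>_amd64.deb

# RedHat: install from a local .rpm package (placeholder file name).
sudo yum install ./rbln_driver-<version>.rpm

# Verify that the NPU devices are now visible.
ls /dev/rbln*
```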
Additional Notes
- Root privileges are required for installation on on-premise servers.
- If you need `.deb` or `.rpm` files, please contact us.
2. RBLN Compiler
The RBLN Compiler is the core component of the RBLN SDK, used to convert pre-trained models into an NPU-executable format. It also provides runtime environments (Python and C/C++) and profiling tools.
Note
An RBLN Portal account is required for installation.
Key Features
- Compile API: Converts pre-trained models into RBLN NPU-executable formats.
- Runtime API:
    - Python runtime: Installed via a `.whl` package.
    - C/C++ runtime: Requires GPG key registration and apt-based installation. See C/C++ runtime installation for details, and the hedged sketch after this list.
- Profiler Support: Offers performance analysis and optimization with the RBLN Profiler.
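The C/C++ runtime setup follows the usual pattern for a vendor apt repository. This is a hedged sketch only: the key URL, repository address, and package name below are placeholders, not the documented values; see C/C++ runtime installation for the real ones.

```bash
# Register the vendor GPG key (placeholder URL).
wget -qO- https://<rbln-apt-repo>/rbln.gpg | sudo gpg --dearmor -o /usr/share/keyrings/rbln.gpg

# Add the apt repository (placeholder address), then install the runtime.
echo "deb [signed-by=/usr/share/keyrings/rbln.gpg] https://<rbln-apt-repo> stable main" | sudo tee /etc/apt/sources.list.d/rbln.list
sudo apt-get update
sudo apt-get install <rbln-cpp-runtime-package>
```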
Installation
- Distributed as a `.whl` package. Install using `pip`:
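A hedged sketch of the `pip` command; the package index URL and version are placeholders, since the actual values come with your RBLN Portal account:

```bash
# Install the RBLN Compiler wheel from the RBLN package index
# (placeholder index URL and version; requires RBLN Portal credentials).
pip3 install -i https://<rbln-package-index>/simple/ rebel-compiler==<latest_version>
```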
3. HuggingFace Model Support (`optimum-rbln`)
`optimum-rbln` integrates HuggingFace APIs, making it easy to compile pre-trained `transformers` and `diffusers` models to run on RBLN NPUs.
Key Features
- HuggingFace Integration: Seamlessly supports `transformers` and `diffusers` for RBLN-based inference.
- Easy Deployment: Simplifies model loading and optimization for RBLN NPUs.
Installation
- Distributed as a `.whl` package:
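A minimal sketch, assuming the wheel is installable directly with `pip`:

```bash
# Install the HuggingFace integration layer for RBLN NPUs.
pip3 install optimum-rbln
```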
4. RBLN Model Zoo
RBLN Model Zoo provides ready-to-use examples for compiling and running pre-trained models on RBLN NPUs. It serves as a reference for adapting custom models.
Key Features
- Pre-trained Models: Contains a diverse collection of scripts for various popular pre-trained models.
- Implementation Guides: Offers step-by-step instructions for compiling and running models on RBLN NPUs.
Installation
- Hosted on GitHub. Clone the repository with:
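A sketch of the clone command, assuming the repository lives at `rebellions-sw/rbln-model-zoo` on GitHub:

```bash
# Clone the RBLN Model Zoo (assumed repository path).
git clone https://github.com/rebellions-sw/rbln-model-zoo.git
```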
5. Serving Frameworks Support
RBLN NPUs integrate with popular serving solutions, including vLLM, Nvidia Triton Inference Server, and TorchServe.
Key Features
- vLLM Support (`vllm-rbln`)
    - Custom vLLM solution for serving large language models (LLMs) on RBLN NPUs.
    - Distributed as a `.whl` package.
    - Requires an RBLN Portal account for installation.
- Nvidia Triton Inference Server Support
    - Refer to Nvidia Triton Inference Server Support for configuration details.
- TorchServe Support
    - Refer to TorchServe Support for installation and usage instructions.
Installation
- vLLM (`vllm-rbln`): install the wheel with `pip`, as shown in the sketch after this list.
- Nvidia Triton Inference Server and TorchServe
    - Visit the Nvidia Triton Inference Server and TorchServe documentation pages for instructions and integration details.
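A hedged sketch for `vllm-rbln`; as with `rebel-compiler`, the index URL and version are placeholders that come with your RBLN Portal account:

```bash
# Install the RBLN-enabled vLLM build (placeholder index URL and version;
# requires RBLN Portal credentials).
pip3 install -i https://<rbln-package-index>/simple/ vllm-rbln==<latest_version>
```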
Congratulations on setting up the RBLN SDK. You can now run PyTorch and TensorFlow models on RBLN NPUs.
Explore the Tutorials for a deeper understanding of how to use the RBLN SDK.